ABSTRACT

Blesser et al. (1) and Shillman et al. (2) have described a theory
and techniques for designing optical character recognition algorithms
using psychophysical methodology. The previous publications have
focused on the methodology used to determine Physical to Functional
Rules (PFRs), whereas the present paper describes the important
preliminary steps that precede the psychological experiments. These
steps include the choice of an appropriate letter pair for study, the
formation of a hypothesis regarding relevant attributes that
distinguish the letters of the pair, and the design of stimuli for the
subsequent psychophysical experiments. These steps are described in
the context of the problem of designing an algorithm for recognizing
handprinted 2's from handprinted Z's.


I. INTRODUCTION

Although the basic idea of having a machine recognize printed
characters has been seriously discussed for over 90 years, the
ultimate machine that would recognize unconstrained handprinted
characters at error rates comparable to human performance has yet to
be built. The experience of the past has shown us that Optical
Character Recognition (OCR) is indeed a difficult problem: the stages
occurring between the insertion of a document and the recognition of
the letters on that document involve complex aspects of mechanics,
optics, electronics and psychophysics. Each stage has its associated
problems along with constrained solutions that may affect the design
and performance of later stages. For example, if fixed thresholds are
set for character binarization, then variations in printing blackness
or photodetector sensitivity will degrade machine performance
independently of the sophistication of the subsequent recognition
stage. This indicates that there must be some degree of interaction
between various stages in order to optimize performance.
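The binarization example can be made concrete with a small sketch. The pixel values, the fixed threshold of 128, and the midpoint rule below are illustrative assumptions and do not come from the paper; the sketch only shows how a fixed threshold can lose a faintly printed stroke that a threshold derived from the scan itself would keep.

```python
def binarize(image, threshold):
    """Map gray levels (0 = black ink, 255 = white paper) to 1 = ink, 0 = background."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


def midpoint_threshold(image):
    """Hypothetical adaptive rule: halfway between the darkest and lightest pixel."""
    pixels = [pixel for row in image for pixel in row]
    return (min(pixels) + max(pixels)) / 2


# Two made-up scans of the same vertical stroke, printed dark and printed faint.
dark_print = [[250, 40, 250], [250, 45, 250]]
faint_print = [[250, 170, 250], [250, 175, 250]]

FIXED_THRESHOLD = 128  # assumed value, chosen only for illustration

dark_fixed = binarize(dark_print, FIXED_THRESHOLD)      # stroke survives
faint_fixed = binarize(faint_print, FIXED_THRESHOLD)    # stroke disappears
faint_adaptive = binarize(faint_print, midpoint_threshold(faint_print))
```

With the fixed threshold the faint stroke never reaches the recognition stage, so no recognition algorithm, however sophisticated, can recover it.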

Thus we see that there are many stages in the OCR process, and that
coupling must eventually be present between the various stages in
order to optimize overall machine performance. We have, however,
focused our efforts primarily on the design and optimization of the
recognition algorithms with the expectation that an understanding of
the issues involved in this final stage of the OCR process will yield
insight into the specification and design of the preceding stages.

II. DIFFICULT CASES

If each letter and numeral occurred in only a few well-defined
shapes, then machine recognition of these characters would be trivial.
Unconstrained handprinted characters, however, occur in an infinite
variety and are therefore not easily recognized by machine. There are
different approaches one might take when first faced with the issue of
designing an OCR algorithm; for example, one might focus on the
majority of A's in a given data set and attempt to describe them in a
formal way. This approach is problematic in that the descriptions
arrived at for two different letters may in the end be identical (for
example, all A's might be described as having a closure at the top and
two descending legs, a description that also applies to all R's).
These cases must then be disambiguated through modification of feature
weights or by introduction of additional features. Rather than save
this step for last, our approach has been to focus directly on these
problem cases rather than on typical letter shapes. Once the difficult
cases are analyzed, the remaining cases are expected to be
straightforward (e.g., there is little problem in devising a simple
algorithm for distinguishing M's from T's). Previous publications [3]
have detailed the psychophysical techniques employed for studying the
problem cases such as V's and Y's, U's and V's, etc. The approach has
recently led to a successful low-error-rate computer algorithm for the
recognition of unconstrained handprinted U's from V's, which are among
the most difficult characters for both humans and machines to identify
correctly [4].

III. IN SEARCH OF PHYSICAL TO
FUNCTIONAL RULES

The PFR

A Physical to Functional Rule (PFR) is a mapping from the physical
domain, which is the graphical image of a character, to the functional
domain from which the character's label can be determined. The only
PFR that has been thoroughly examined, validated, and reported is for
the attribute LEG, which distinguishes between the letters of the
pairs (called confusion pairs) shown in Fig. 1.

Fig. 1. Characters distinguished by the attribute LEG.


In a neutral context, the PFR for LEG is given by:

$$\text{Functional LEG:}\quad \frac{\ell}{L}\ \underset{\text{Not present}}{\overset{\text{Present}}{\gtrless}}\ 0.17$$

where $\ell$ = length of physical leg, and $L$ = length of line of
which the leg is a part. This rule indicates that if the length of the
physical leg is greater than 17% of the length of the stroke of which
it is a part, then the characters would most likely be labeled as Y,
F, P, H and A.
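As a minimal sketch, the LEG rule can be written as a single ratio test. The function name `functional_leg`, its argument names, and the choice to treat a ratio of exactly 0.17 as "not present" are assumptions made here for illustration; the rule as stated gives only the 0.17 threshold on the ratio of leg length to stroke length.

```python
LEG_THRESHOLD = 0.17  # threshold on leg length / stroke length quoted in the text


def functional_leg(leg_length, stroke_length, threshold=LEG_THRESHOLD):
    """Return True if the functional attribute LEG is judged present.

    leg_length    -- length of the physical leg
    stroke_length -- length of the line of which the leg is a part
    Handling of a ratio exactly equal to the threshold is an assumption.
    """
    if stroke_length <= 0:
        raise ValueError("stroke_length must be positive")
    return leg_length / stroke_length > threshold


long_leg = functional_leg(2.0, 10.0)   # ratio 0.20, LEG present
short_leg = functional_leg(1.0, 10.0)  # ratio 0.10, LEG not present
```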

The remainder of this paper describes the important preliminary
steps that precede the psychophysical experiments conducted to
determine PFRs such as the one we have given. These steps are:
choosing an appropriate letter-pair for study, forming an initial
hypothesis regarding relevant attributes that distinguish the letters
of the pair, and designing the stimuli for a psychophysical experiment
to test the hypothesis and generate a PFR.

Choosing a Confusion Pair

The first step in the search for a PFR is to choose a pair of
characters that are often or easily confused with each other. Neisser
and Weene [5] studied human recognition of handprinted characters, and
showed, for example, that many handprinted 2's are misrecognized as
handprinted Z's. After the likelihood of occurrence of various
confusion pairs has been determined, it may be noted that the
potential confusion in a number of pairs appears to result from a
common cause, and hence the study of a particular pair may yield
insight into other confusion pairs; for example, information on 2-Z
discrimination may be useful in understanding the confusion pairs
l-V and 5-S, and perhaps in discriminating script n from handprinted
M.
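One way to determine the likelihood of occurrence of confusion pairs is to tally misrecognitions from a labeling experiment. The sketch below assumes a hypothetical list of (true label, response) trials; the data are invented for illustration and are not taken from Neisser and Weene, and raw counts are used rather than rates.

```python
from collections import Counter


def rank_confusion_pairs(trials):
    """Count misrecognitions per unordered character pair, most frequent first.

    trials -- iterable of (true_label, response_label) tuples
    """
    counts = Counter()
    for true_label, response in trials:
        if true_label != response:
            # Treat 2-read-as-Z and Z-read-as-2 as the same confusion pair.
            counts[tuple(sorted((true_label, response)))] += 1
    return counts.most_common()


# Invented example data: 4 confusions between 2 and Z, 2 between U and V.
example_trials = (
    [("2", "Z")] * 3 + [("Z", "2")] + [("U", "V")] * 2 + [("A", "A")] * 5
)
ranking = rank_confusion_pairs(example_trials)
```

The pair at the head of the ranking is a natural first candidate for study.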

Initial  Hypothesis  Formation 

When handprinted, the numeral 2 occurs in a large variety of forms,
as opposed to the letter Z, which has one basic form. The next step is
to examine those 2's that lie near the 2-Z interletter boundary in an
attempt to gain insight into the relevant attributes. We can focus on
boundary cases by constructing an interletter trajectory from the
archetypal shape of one letter to the archetypal shape of the other
letter of the confusion pair. Although generally straightforward, this
process is complicated in the 2-Z case by the fact that although there
is one accepted Z archetype, there are a number of different
"standard" forms for the handprinted numeral 2. Although these forms
may have all evolved from one archetype, several forms now occur with
sufficient frequency to be considered archetypal [6]. In his very
thorough study of numerals, Wright [u] suggests that the numeral 2 can
be studied by segmenting it into four regions: the turn, base, stem
and head. Each of these segments can occur in a variety of forms
summarized descriptively in Fig. 2, and examined in the following
text.

Fig. 2. Regions of the character 2.

a.  The  turn 

For the purpose of 2-Z recognition, the turn can be categorized as
looped or simple. This categorization is useful because a looped turn
is extremely infrequent in handprinted samples of the letter Z and
does not occur near the 2-Z interletter boundary, and is therefore not
helpful in defining the recognition boundary between 2's and Z's. This
does not mean that this portion of the character would not be used in
a recognition algorithm; on the contrary, the presence of a looped
turn is a strong indication that the character is a 2 in much the same
way that a horizontal crossbar is a strong indicator of Z-ness; these
embellishments could be searched for and utilized in the initial
stages of the OCR decision process.

b.  The  base 

Although Fig. 2 indicates that there may be four regions where
information about the character's label may be located, the base
appears not to contain any information relevant to 2-Z discrimination.
Figure 3 shows that the base alone does not determine letter label:
for any chosen base, the character can be completed so as to become
either a 2 or a Z. A more formal method of proving that the base does
not contain a Functional Attribute is to note that it does not satisfy
the primary definition of this attribute, which requires that it must
be possible to alter a character's identity along a trajectory by
varying the functional attribute. In this case, the test requires that
an interletter 2-Z trajectory exist along which only the base is
varied; since no such trajectory has been found, the base is ruled out
as an informative region. For simplicity, therefore, we shall
initially study characters having flat bases and simple turns.

Fig. 3. Negligible effect of the base on character label.

c.  The  stem 


The three types of stems are doubly curved, simply curved, and
straight. Figure 4 shows that both 2's and Z's can be constructed to
have any of the three possible stems.

Fig. 4. Negligible effect of the stem on character label.

Although the Z's with doubly curved or simply curved stems are not
very good, we see that the stem by itself does not determine letter
label. It appears that the doubly curved stem is least Z-like, whereas
the straight stem is most Z-like. Because there is more than one
archetypal 2 shape, the fact that a doubly curved stem is least Z-like
does not necessarily mean that it is most 2-like; in fact, all three
types of stems occur with roughly the same frequency in handprinted
samples of the numeral 2 [7].


Fig. 5. Effects of increasing double curvature along $d_1$ and
single curvature along $d_2$.

Figure 5 shows two trajectories along which the stem is varied;
along dimension $d_1$ the stem becomes increasingly doubly curved, and
along dimension $d_2$ it becomes increasingly simply curved. The
postulated decrease in goodness [8] along both trajectories is
hypothesized to be due to two independent factors; the major decrease
is due simply to the degradation of the character (movement away from
the central portion of the Z space) and only a small portion is due to
the influence of the numeral 2 (movement toward the 2 space). This
effect is illustrated in Fig. 6, which shows both trajectories of
Fig. 5 replotted on the 2-Z space. Along either of these two
trajectories, the characters clearly become much less Z-like and
slightly more 2-like; this is illustrated by the movement away from
the center of the Z space and only slowly toward the 2 space.

d.  The  head 

For the purpose of 2-Z discrimination, the five heads described by
Wright can be categorized as either marked (containing loops or spurs)
or simple. As in the case of the looped turn, the presence of a marked
head is infrequent in the letter Z and could thus be used as an
initial test for character label. Focusing on simple heads, it can be
noted that there are two regions that contribute to the perceived
curvature, or roundness, of the head: the curvature at the start of
the stroke (in the upper left of the character) and the region of
curvature in the upper right corner. Figure 6 shows variations along
both these dimensions, $d_3$ and $d_4$, respectively; although the
stem apparently has some effect on letter label, the effect along
either $d_1$ or $d_2$ is thought (and is pictured) to be small
compared to the effects along $d_3$ and $d_4$; this indicates that the
major change in identity is due to changes in the head of the
character. Figure 6 also contains a trajectory directly from the
central portion of the Z space to the 2 space along $d_3 + d_4$.

Fig. 6. Representation of the 2-Z space along four dimensions.

The hypothesized 2-Z space shown in Fig. 6 indicates the complexity
of the 2-Z problem, which is brought about, in large part, by the
presence of multiple archetypal shapes of the numeral 2. It appears
that there are at least four physical dimensions that are involved in
2-Z discrimination. One can traverse an interletter trajectory from an
archetypal Z to an archetypal 2 via $d_1 + d_3 + d_4$, or
$d_2 + d_3 + d_4$, or via $d_3 + d_4$. Since the last trajectory is
involved in each of the other two and since its effect is thought to
be major, it seems appropriate to investigate initially the dimensions
$d_3$ and $d_4$.

IV. DISCUSSION


Stimulus Design

Based on the foregoing analysis, the 2's initially chosen for study
have simple heads, straight stems, and flat bases. The physical
variables involved in the constructed trajectories were chosen to
affect the perceived curvature of the head of the character. Each
character has two curved and three straight segments, as shown in
Fig. 7.

The two curved segments are arcs tangent at points 2, 3, and 5 to
the corresponding straight segments. Horizontal alignment is
maintained between points 1 and 6 and between points 4 and 7 such that
the entire character just fits in a rectangle of constant height and
width. The two physical variables, D, the distance of point 1 below
the top, and R1, the radius of the second arc, are varied while point
2 is kept centered horizontally.

For given values of D, R2 is found to construct an arc from point 1
to point 2. For given values of R1, point 5 is found to satisfy
tangency and construct an arc from point 3 to point 5. Figure 8 shows
a two-dimensional 2-Z trajectory plotted on a Calcomp 563 line plotter
with a .3 mm Mars technical pen. R1 varies linearly along the
horizontal and D varies linearly along the vertical.
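The two-dimensional stimulus set can be sketched as a grid of (R1, D) pairs together with the first-arc radius R2 implied by each D. The geometry used for R2 is an assumption made here, since Fig. 7 is not reproduced: point 1 is taken to lie on the left edge of the bounding rectangle at distance D below the top, point 2 on the top edge at the horizontal center, and the arc is taken to be tangent to a horizontal top segment at point 2. Under those assumptions the arc's center lies directly below point 2, and $(W/2)^2 + (R2 - D)^2 = R2^2$ gives $R2 = ((W/2)^2 + D^2) / (2D)$, where W is the rectangle width. All function names, ranges, and units are hypothetical.

```python
import math


def linear_steps(start, stop, count):
    """Return `count` evenly spaced values from start to stop, inclusive."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (stop - start) / (count - 1)
    return [start + step * i for i in range(count)]


def stimulus_grid(r1_range, d_range, columns, rows):
    """Rows of (R1, D) pairs: R1 varies along a row, D varies from row to row."""
    r1_values = linear_steps(r1_range[0], r1_range[1], columns)
    d_values = linear_steps(d_range[0], d_range[1], rows)
    return [[(r1, d) for r1 in r1_values] for d in d_values]


def first_arc_radius(width, d):
    """R2 for the arc from point 1 to point 2 under the assumed geometry.

    A value of D equal to zero degenerates to a straight top stroke,
    reported here as an infinite radius.
    """
    if d <= 0:
        return math.inf
    half_width = width / 2
    return (half_width ** 2 + d ** 2) / (2 * d)


# Hypothetical ranges in arbitrary units, chosen only to show the layout.
grid = stimulus_grid((0.0, 1.0), (0.0, 2.0), columns=3, rows=2)
r2_example = first_arc_radius(width=2.0, d=1.0)
```

Solving for point 5 from R1 would need the stem geometry of Fig. 7 and is left out of the sketch.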


Figure 8 indicates that both physical variables chosen do influence
labeling; it cannot yet be determined, however, whether there are two
functional attributes involved, each of which takes on as its argument
one of the physical variables, or whether there is one functional
attribute which takes on both variables as arguments. It does appear
that the curvature in the right-hand corner of the head has somewhat
greater importance: Fig. 9 shows that if there is an angular bend in
the right corner, then no amount of curvature at the start of the
stroke and no amount of 2-ness in the stem or base of the character
will be sufficient to force the character to be a 2. Note that even
the addition of a strong 2 indicator, a looped turn, is not sufficient
to create a 2 if the upper corner is angular. Studying the
trajectories in Fig. 8, it appears that the issue of segmentation [8]
may be involved; if the head of the character is composed of one
functional segment, then the character is a 2, whereas if it is
composed of two functional segments, then it is a Z.

Fig. 9. Dominant effect of an angular bend in the top right corner.

It may be that the PFR determining the number of functional segments
will take on as its arguments both the curvature at the start of the
stroke and the curvature in the upper right corner.

The working hypotheses developed in this paper are listed below:

(1) 2-Z discrimination is complicated by the presence of multiple
    archetypes of the numeral 2.

(2) There are a number of features involved in 2-Z discrimination;
    in decreasing order of importance they are: (a) the curvature in
    the upper right-hand corner of the head, (b) the curvature at
    the start of the stroke, and (c) the nature of the stem, whether
    straight, simply curved, or doubly curved.

(3) The base is not an informative region for 2-Z discrimination.

(4) Embellishments such as a horizontal crossbar in the Z or marked
    head and/or a looped turn in the 2 are strong clues for
    determining identity and can possibly be treated separately from
    the features described in (2).

(5) In the absence of embellishments, it may be that the character
    is a 2 if the head is composed of one functional segment and a Z
    if it is composed of two functional segments. The PFR
    determining the number of functional segments may be a function
    of the three features listed in (2).
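Hypotheses (4) and (5), together with the observation that an angular upper right corner overrides a looped turn, suggest a simple decision order. The sketch below is one possible reading and not a tested algorithm: the function name, its arguments, the ordering of the checks, and the "undecided" outcome are all assumptions, and the PFR that would supply the number of functional head segments is not yet known, so that count is passed in as an input (or as None when unavailable).

```python
def label_2_or_z(has_crossbar, has_2_embellishment, head_segments=None):
    """Return "Z", "2", or "undecided" under the working hypotheses.

    has_crossbar        -- horizontal crossbar present (strong Z clue)
    has_2_embellishment -- marked head and/or looped turn (strong 2 clue)
    head_segments       -- number of functional segments in the head
                           (1 or 2), or None if not yet determined
    """
    if has_crossbar:
        return "Z"
    if head_segments == 2:
        # Two functional segments (an angular corner) are taken to
        # dominate, even when a looped turn is present.
        return "Z"
    if has_2_embellishment:
        return "2"
    if head_segments == 1:
        return "2"
    return "undecided"


looped_but_angular = label_2_or_z(False, True, head_segments=2)
looped_only = label_2_or_z(False, True)
plain_round_head = label_2_or_z(False, False, head_segments=1)
```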
Fig. 8. A two-dimensional 2-Z trajectory.

We are now undertaking psychophysical experiments to test these
hypotheses.